Emotional speech classification using adaptive sinusoidal modelling
نویسندگان
چکیده
Automatic classification of emotional speech is a challenging task with applications in synthesis and recognition. In this paper, an adaptive sinusoidal model (aSM), called the extended adaptive Quasi-Harmonic Model eaQHM, is applied on emotional speech analysis for classification purposes. The parameters of the model (amplitude and frequency) are used as features for the classification. Using a well known database of narrowband expressive speech (SUSAS), we develop two separate Vector Quantizers (VQ) for the classification, one for the amplitude and one for the frequency features. It is shown that the eaQHM can outperform the standard Sinusoidal Model in classification scores. However, single feature classification is inappropriate for higher-rate classification. Thus, we suggest a combined amplitude-frequency classification scheme, where the classification scores of each VQ are weighted and ranked, and the decision is made based on the minimum value of this ranking. Experiments show that the proposed scheme achieves higher performance when the features are obtained from eaQHM. Future work can be directed to different classifiers, such as HMMs or GMMs, and ultimately to emotional speech transformations and synthesis.
منابع مشابه
Classification of emotional speech using spectral pattern features
Speech Emotion Recognition (SER) is a new and challenging research area with a wide range of applications in man-machine interactions. The aim of a SER system is to recognize human emotion by analyzing the acoustics of speech sound. In this study, we propose Spectral Pattern features (SPs) and Harmonic Energy features (HEs) for emotion recognition. These features extracted from the spectrogram ...
متن کاملEvaluation of Hidden Semi-Markov Models Training Methods for Greek Emotional Text-to-Speech Synthesis
This paper describes and evaluates four different HSMM (hidden semi-Markov model) training methods for HMM-based synthesis of emotional speech. The first method, called emotion-dependent modelling, uses individual models trained for each emotion separately. In the second method, emotion adaptation modelling, at first a model is trained using neutral speech, and thereafter adaptation is performe...
متن کاملA Comparative Study of Gender and Age Classification in Speech Signals
Accurate gender classification is useful in speech and speaker recognition as well as speech emotion classification, because a better performance has been reported when separate acoustic models are employed for males and females. Gender classification is also apparent in face recognition, video summarization, human-robot interaction, etc. Although gender classification is rather mature in a...
متن کاملPhase modelling of speech excitation for low bit-rate sinusoidal transform coding
Sinusoidal transform coding (STC) techniques model speech as the sum of sine-waves whose frequencies, amplitudes and phases are specified at regular intervals. To achieve a low-bit rate representation, only the spectral envelope is encoded and the phases are regenerated according to a minimum phase assumption. In this paper, the inaccuracy of the minimum phase model is demonstrated. It is shown...
متن کاملRecognizing the Emotional State Changes in Human Utterance by a Learning Statistical Method based on Gaussian Mixture Model
Speech is one of the most opulent and instant methods to express emotional characteristics of human beings, which conveys the cognitive and semantic concepts among humans. In this study, a statistical-based method for emotional recognition of speech signals is proposed, and a learning approach is introduced, which is based on the statistical model to classify internal feelings of the utterance....
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2014